112 research outputs found
Some statistical and computational challenges, and opportunities in astronomy
The data complexity and volume of astronomical findings have increased in recent decades due to major technological improvements in instrumentation and data collection methods. The contemporary astronomer is flooded with terabytes of raw data that produce enormous multidimensional catalogs of objects (stars, galaxies, quasars, etc.) numbering in the billions, with hundreds of measured numbers for each object. The astronomical community thus faces a key task: to enable efficient and objective scientific exploitation of enormous multifaceted data sets and the complex links between data and astrophysical theory. In recognition of this task, the National Virtual Observatory (NVO) initiative recently emerged to federate numerous large digital sky archives and to develop tools to explore and understand these vast volumes of data. The effective use of such integrated massive data sets presents a variety of new and challenging statistical and algorithmic problems that require methodological advances. An interdisciplinary team of statisticians, astronomers and computer scientists from The Pennsylvania State University, California Institute of Technology and Carnegie Mellon University is developing statistical methodology for the NVO. A brief glimpse into the Virtual Observatory and the work of the Penn State-led team is provided here.
Using R-based VOStat as a Low-Resolution Spectrum Analysis Tool
We describe here an online software suite, VOStat, written mainly for the Virtual Observatory, a novel structure in which astronomers share terabyte-scale data. Written mostly in the public-domain statistical computing language and environment R, it can perform a variety of statistical analyses on multidimensional, multi-epoch data with errors. Included are techniques that allow astronomers to start with multi-color data in the form of low-resolution spectra and select special kinds of sources in a variety of ways, including color outliers. Here we describe the tool and demonstrate it with an example from Palomar-QUEST, a synoptic sky survey.
Edgeworth expansions for errors-in-variables models
Edgeworth expansions for sums of independent but not identically distributed multivariate random vectors are established. The results are applied to obtain valid Edgeworth expansions for estimates of regression parameters in linear errors-in-variables models. The expansions for studentized versions are also developed. Further, Edgeworth expansions for the corresponding bootstrapped statistics are obtained. Using these expansions, the bootstrap distribution is shown to approximate the sampling distribution of the studentized estimators better than the classical normal approximation.
Limit theorems for functions of marginal quantiles
Multivariate distributions are explored using the joint distributions of
marginal sample quantiles. Limit theory for the mean of a function of order
statistics is presented. The results include a multivariate central limit
theorem and a strong law of large numbers. A result similar to Bahadur's
representation of quantiles is established for the mean of a function of the
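marginal quantiles. In particular, it is shown that
$\sqrt{n}\Bigl(\frac{1}{n}\sum_{i=1}^{n}\phi\bigl(X_{n:i}^{(1)},\ldots,X_{n:i}^{(d)}\bigr)-\bar{\gamma}\Bigr)=\frac{1}{\sqrt{n}}\sum_{i=1}^{n}Z_{n,i}+o_P(1)$
as $n\to\infty$, where $\bar{\gamma}$ is a constant and $Z_{n,i}$ are
i.i.d. random variables for each $n$. This leads to the central limit theorem.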
Weak convergence to a Gaussian process using equicontinuity of functions is
indicated. The results are established under very general conditions. These
conditions are shown to be satisfied in many commonly occurring situations.
Comment: Published in Bernoulli (http://isi.cbs.nl/bernoulli/) at
http://dx.doi.org/10.3150/10-BEJ287 by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)
- …